AI-Based Stamp Detection

In this project, an AI-based system was developed to recognize stamps in PDF documents. The process consisted of two main steps: first, each stamp was localized in the document; then the localized stamp was identified. This two-step approach enabled precise and efficient processing of PDF documents, especially for applications in document management and the automation of administrative processes.
Process: From localization to identification
The process is divided into two main steps:
- Localization of the stamps in the PDF document: the position of each stamp on the page is determined.
- Identification of the stamp: once the position is known, the system determines which stamp it is.
This approach allows robust recognition, even if the stamps occur in different sizes and rotations.
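The two steps above can be sketched as a small pipeline. This is a minimal illustration, not the project's actual code: the names `recognize_stamps`, `StampResult`, `localize`, `crop` and `identify` are hypothetical, and the two models are passed in as plain callables so that the control flow is visible without depending on any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed box convention: (left, top, right, bottom) in page pixels.
Box = Tuple[int, int, int, int]


@dataclass
class StampResult:
    box: Box        # where the stamp was found
    stamp_id: str   # which stamp it was identified as


def recognize_stamps(
    page: object,
    localize: Callable[[object], List[Box]],
    crop: Callable[[object, Box], object],
    identify: Callable[[object], str],
) -> List[StampResult]:
    """Run localization first, then identify every localized region."""
    results = []
    for box in localize(page):        # step 1: where are the stamps?
        region = crop(page, box)      # cut the stamp out of the page image
        results.append(StampResult(box, identify(region)))  # step 2: which stamp?
    return results
```

Keeping the two stages behind separate callables means either model can be retrained or swapped without touching the other.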
Data basis and data augmentation
Only a small amount of sample data was available for developing the ML models, far too little to train them directly. To overcome this hurdle, synthetic data was generated: based on the provided examples, a set of synthetic stamps was created, along with PDF documents carrying these stamps.
The generation of synthetic stamps made it possible to train the model on a large number of possible stamp variants without having to rely on a large amount of real data. This step increased the variance in the training data and improved the generalization capability of the model.
Part of the generation was to make the stamps look as realistic as possible by artificially aging and blurring them. Augmentation was also used to increase diversity: the stamps were rotated, distorted and discolored.
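The following sketch illustrates how one synthetic training sample could be parameterized and labeled. It is an assumption-laden illustration rather than the project's scripts: the `Augmentation` fields, the value ranges, and the page and stamp sizes are all hypothetical. Only the label layout (class index followed by normalized center x, center y, width and height) follows the usual Darknet/YOLO text format. Rendering the stamp image itself is left out.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Augmentation:
    angle_deg: float    # rotation of the stamp
    scale: float        # size factor relative to the template
    blur_radius: float  # simulates a smeared imprint
    ink_fade: float     # 0.0 = full ink, 1.0 = completely faded


def sample_augmentation(rng: random.Random) -> Augmentation:
    """Draw one random variant; the ranges are illustrative assumptions."""
    return Augmentation(
        angle_deg=rng.uniform(-180.0, 180.0),
        scale=rng.uniform(0.7, 1.3),
        blur_radius=rng.uniform(0.0, 2.0),
        ink_fade=rng.uniform(0.0, 0.6),
    )


def rotated_size(width, height, angle_deg):
    """Axis-aligned bounding size of a width x height rectangle after rotation."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    return width * c + height * s, width * s + height * c


def yolo_label(class_id, center_x, center_y, box_w, box_h, page_w, page_h):
    """One label line: class x_center y_center width height, normalized to the page."""
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        class_id, center_x / page_w, center_y / page_h, box_w / page_w, box_h / page_h
    )


# Usage: place a 200 x 100 px stamp template at the center of a 1240 x 1754 px page.
rng = random.Random(0)  # fixed seed so the generated data set is reproducible
aug = sample_augmentation(rng)
box_w, box_h = rotated_size(200 * aug.scale, 100 * aug.scale, aug.angle_deg)
print(yolo_label(0, 620, 877, box_w, box_h, page_w=1240, page_h=1754))
```

Because the generator places every stamp itself, the bounding-box label comes for free, which is what makes synthetic documents attractive when real annotated data is scarce.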
Models used
1. Localization of the stamps: YOLO
A YOLO model was used to localize the stamps in the document. YOLO (You Only Look Once) is a family of single-stage object detectors that predict bounding boxes and class scores in a single pass over the image, which makes it well suited to real-time applications. It recognized the stamps efficiently and returned a bounding box for each stamp's position on the page.
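One common way to check how closely a predicted box matches the true stamp position is intersection over union (IoU). The source does not state which metric was used in the experiments, so the sketch below is only an illustration; the `(left, top, right, bottom)` box convention is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    intersection = max(0, right - left) * max(0, bottom - top)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

A value of 1.0 means the predicted and true boxes coincide, and 0.0 means they do not overlap at all.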
2. Identification of the stamps: Siamese neural network
A Siamese neural network was used to identify the specific stamp. A Siamese network is an architecture for measuring the similarity between two inputs: two identical subnetworks with shared weights each process one input, and the distance between the resulting feature vectors indicates how similar the two inputs are.
This approach is particularly useful for classification problems where there are many different classes but not enough training data for each class. In this case, the Siamese network allowed the stamp to be accurately identified by matching it against an existing stamp catalog.
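Matching against a catalog can be sketched as a nearest-neighbor search over feature vectors. The sketch assumes that the trained network has already turned the cropped stamp and every catalog stamp into an embedding; the function names, the Euclidean distance, the optional `max_distance` threshold and the toy vectors are all hypothetical choices for illustration.

```python
import math


def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_stamp(query_embedding, catalog, max_distance=None):
    """Return (stamp_id, distance) for the closest catalog entry.

    Returns (None, distance) if even the best match is farther away
    than max_distance, i.e. the stamp is treated as unknown.
    """
    if not catalog:
        raise ValueError("catalog is empty")
    best_id, best_dist = min(
        ((stamp_id, euclidean_distance(query_embedding, emb))
         for stamp_id, emb in catalog.items()),
        key=lambda item: item[1],
    )
    if max_distance is not None and best_dist > max_distance:
        return None, best_dist
    return best_id, best_dist


# Usage with toy three-dimensional embeddings (real ones would be much longer).
catalog = {
    "office_a": [0.9, 0.1, 0.0],
    "office_b": [0.0, 1.0, 0.2],
}
stamp_id, distance = identify_stamp([0.8, 0.2, 0.1], catalog)
print(stamp_id)
```

With this design, adding a new stamp to the catalog in principle only requires computing its embedding, not retraining a classifier with one output per stamp.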
Conclusion
Using modern deep learning methods, a robust system for stamp recognition in PDF documents was developed. The combination of synthetic data, a YOLO model for localization and a Siamese network for identification yielded an efficient and accurate solution that is well suited to automating document processes.
Activities
- Design and generation of synthetic stamp data (stamps, as well as stamped PDF documents); developed in the form of Python scripts
- Several training runs of a YOLOv3 model on the stamp data to localize the stamps
- Architectural design and implementation of a Siamese network to identify the stamps
- Execution and evaluation of several experiments to identify the best AI approach
- Presentation of the results